Callback verification
Callback verification, also known as callout verification or Sender Address Verification, is a technique used by SMTP software in order to validate e-mail addresses. The most common target of verification is the sender address from the message envelope (the address specified during the SMTP dialogue as "MAIL FROM"). It is mostly used as an anti-spam measure.
Purpose
Since a large percentage of e-mail spam has forged sender ("mfrom") addresses, some spam can be detected by checking whether forging resulted in an invalid address, using this method.
A related technique is "call forwards", in which a secondary or firewall mail exchanger can verify recipients at the primary mail exchanger for the domain in order to decide whether the address is deliverable.
Process
The receiving mail server verifies the sender address by making an SMTP connection to the mail exchanger for it (found via the usual MX records), pretending to be creating a bounce, but stopping just before any e-mail is sent. The commands sent out are:
HELO <verifier host name>
MAIL FROM:<>
RCPT TO:<the address to be tested>
QUIT
Equivalently, the MAIL FROM and RCPT TO commands can be replaced by the VRFY command, however the VRFY command is not required to be supported and is usually disabled in modern MTAs.
Both of these techniques are technically compliant with the relevant SMTP RFCs (RFC 5321), however RFC 2505 (a Best Current Practice) recommends, by default, disabling the VRFY command to prevent directory harvest attacks. (One widespread interpretation implies that the MAIL FROM/RCPT TO pair of commands should also respond the same way, but this is not stated by the RFCs.)
To work around a limitation (see below), the MAIL FROM can be non-blank.
Limitations
The documentation for both postfix and exim caution against the use[1][2] of this technique and mention many limitations to SMTP callbacks. In particular, there are many situations where it is either ineffective or causes problems to the systems that receive the callbacks.
- Some regular mail exchangers do not give useful results to callbacks:
- Servers that reject all bounce mails (contrary to the RFCs). To work around this problem, postfix, for example, uses either the local postmaster address or an address of "double-bounce" in the MAIL FROM part of the callout. This work-around, however, will fail if Bounce Address Tag Validation is used to reduce backscatter.[3] Callback verification can still work if rejecting all bounces happens at the DATA stage instead of the earlier MAIL FROM stage, while rejecting invalid e-mail addresses remains at the RCPT TO stage instead of also being moved to the DATA stage.[1][2]
- Servers that accept all e-mail address at RCPT TO stage but reject invalid ones at DATA stage. This is commonly done in order to prevent directory harvest attacks and will, by design, give no information about whether an e-mail address is valid and thus prevent callback verification from working.[1][3]
- Servers that accept all mails during the SMTP dialogue (and generate their own bounces later).[1] This problem can be alleviated by testing a random non-existent address as well as the desired address (if the test succeeds, further verification is useless).
- Servers that implement catch-all e-mail will, by definition, consider all e-mail addresses to be valid and accept them. Like systems that accept-then-bounce, a random non-existent address can detect this.
- The callback process can cause delays in delivery because the mail server where an address is verified may use slow anti-spam techniques, including "greet delays" (causing a connection delay) and greylisting (causing a verification deferral).[1]
- If the system being called back to uses greylisting the callback may return no useful information until the greylisting time has expired. Greylisting works by returning a "temporary failure" (a 4xx response code) when it sees an unfamiliar MAIL FROM/RCPT TO pair of email addresses. A greylisting system may not give a "permanent failure" (a 5xx response code) when given an invalid e-mail address for the RCPT TO, and may instead continue to return a 4xx response code.[4]
- Some e-mail may be legitimate but not have a valid "envelope from" address due to user error or just misconfiguration. The positive aspect is that the verification process will usually cause an outright rejection, so if the sender was not a spammer but a real user, they will be notified of the problem.
- If a server receives a lot of spam may do a lot of callbacks. If those addresses are invalid or spamtrap, the server will look very similar to a spammer who is doing a dictionary attack to harvest addresses. This in turn might get the server blacklisted elsewhere.[1][3][5]
- Every callback places an unasked for burden on the system being called back to, with very few effective ways for that system to avoid the burden. In extreme cases, if a spammer abuses the same sender address and uses it at a sufficiently diverse set of receiving MXs, all of which use this method, they might all try the callback, overloading the MX for the forged address with requests (effectively a Distributed Denial of Service attack).[3]
- Callback verification has no effect if spammers spoof real email addresses[3][6] or use the null bounce address.
Some of these problems are caused by originating systems violating or stretching the limits of RFCs; verification problems are only reflecting these problems back to the senders, like unintentionally used invalid addresses, rejection of the null sender, or greylisting (where, for example, the delay caused by the verifying recipient is closely related to the delay caused by the originator). In many cases this in turn helps originator system to detect the problems, and fix them (like unintentionally not being able to receive valid bounces).
Several of the above problems are reduced by caching of verification results. In particular, systems that give no useful information (not rejecting at the RCPT TO time, have catch-all e-email, etc.) can be remembered and no future call backs to those systems need to be made. Also, results (positive or negative) for specific e-mail addressas can be remembered. MTAs like Exim have caching built in.[2]
References
External links